Abstract. A global, model-independent search for high-pT exotic phenomena is presented using 
927 pb⁻¹ of CDF II data. The search algorithms employed in this analysis are Vista and Sleuth. 
These proceedings focus on Vista, including a description of the method and a summary of results. 

PACS. 12.60.-i Models beyond the standard model 



1 Introduction

A model-independent search for new physics is presently
well motivated. While the W boson, Z boson, and top quark
represented very specific predictions of an already well
established Standard Model, and were therefore nearly
guaranteed targets, the exciting physics expected to lie
beyond the Standard Model may assume any of a number of
different forms. Even within the Minimal Supersymmetric
Standard Model, different points in the 105-dimensional
parameter space correspond to a large array of possible
signatures.

Performing a global search requires a concrete and
practical strategy. The approach taken by most searches
of more limited scope involves selecting a proposed
model of new physics from the existing literature and
searching for its signature in data collected at the
energy frontier. An alternative approach takes a somewhat
broader view, imposing less restrictive assumptions as to
what the first signature of new physics may be. Rather
than defining traditional "control" and "signal" regions,
all regions of the data are considered to potentially
harbor the first sign of new physics. Simultaneously, all
regions contribute information used collectively to
constrain the predicted Standard Model background. The
Standard Model prediction consists entirely of Monte
Carlo events (except for estimation of non-collision
backgrounds, including cosmic rays and beam halo, modeled
using data with few reconstructed tracks). This
prediction is compared with CDF data in all high-pT final
states, in a large number of kinematic variables
constructed from 4-vector quantities. A minimal set of
well motivated correction factors enters the calculation
of the Standard Model background, with values adjusted
under external constraints to minimize global
(large-scale) disagreement with data. Discrepancies that
persist in spite of efforts to achieve agreement by
reasonable adjustments of the correction model may
motivate a discovery claim.

http://www.mit.edu/~gchouda/, gchouda@mit.edu



2 The method

The CDF detector, described elsewhere [?], records
collisions of protons and antiprotons at √s = 1.96 TeV.
The Monte Carlo events composing the Standard Model
prediction are generated primarily using PYTHIA [5],
HERWIG [?], and MADEVENT [4]. After generation, the
Standard Model events pass through a GEANT-based
simulation of the CDF detector. In each event, energetic
and isolated "objects" are identified with sufficiently
large transverse momentum (pT > 17 GeV): electrons (e),
muons (μ), photons (γ), taus (τ), non-b-quark jets (j),
b-tagged jets (b), and missing transverse momentum (p̸T).
Roughly two million data events with one or more
sufficiently high-pT objects are included in the
analysis.

Events so selected are partitioned into exclusive final
states according to reconstructed final state objects.
The e+e− final state thus consists of all events
containing exactly one positron, one electron, and no
other reconstructed object. The partitioning is
orthogonal, with each event associated with one and only
one final state. Possible final states are defined
algorithmically, and are dynamically created to
accommodate all events: an observed event with seventeen
muons would prompt the creation of a corresponding final
state, which would then be included in subsequent
analysis. Exclusive final states allow an algorithmic
specification of a finite set of kinematic variables that
make sense for events in each final state. Applicable
variables for each final state include object transverse
momenta, polar and azimuthal angles, angles between pairs
of objects, masses of all object combinations, and a
number of additional specialized variables. With data and
Standard Model background events partitioned by the same
rule, the search for discrepancies can proceed
separately in each final state.
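
The partitioning rule described above can be illustrated with a minimal sketch. The event format (a list of `(kind, pt)` pairs), the label strings, and the function names are hypothetical and chosen only for illustration; they are not taken from the Vista code.

```python
from collections import Counter, defaultdict

PT_THRESHOLD_GEV = 17.0  # object pT threshold quoted in the text


def final_state_label(objects):
    """Return a canonical label for the exclusive final state of one event.

    `objects` is a list of (kind, pt) pairs, e.g. ("e+", 40.0).
    This event format is an assumption of the sketch.
    """
    counts = Counter(kind for kind, pt in objects if pt > PT_THRESHOLD_GEV)
    # Sorting by kind makes the label independent of reconstruction order.
    return " ".join(f"{n}{kind}" for kind, n in sorted(counts.items()))


def partition(events):
    """Assign every event to exactly one final state.

    Final states are created the first time they are seen, so an event
    with seventeen muons simply opens a new "17mu" state.
    """
    states = defaultdict(list)
    for event in events:
        label = final_state_label(event)
        if label:  # skip events with no object above threshold
            states[label].append(event)
    return dict(states)
```

Because data and simulated events would pass through the same `partition` function, the two are guaranteed to be compared state by state under one rule.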

Standard Model Monte Carlo events are adjusted
by a minimal set of theoretical and experimental
correction factors that include the integrated luminosity,
one k-factor^1 for each of roughly twenty processes,
approximately twenty object (mis)identification
probabilities, and four online trigger efficiencies. With
only these 44 correction factors, a comparison is made
between data and Standard Model prediction in over
three hundred exclusive final states and over ten
thousand kinematic distributions. The goal of the Vista
correction model is not necessarily a perfectly accurate
estimation of the Standard Model background in all
final states, but rather an estimation reliable enough to
indicate if a signal of new physics is present.

A global fit determines the values of the correction
factors, using simultaneously information from all data
entering the analysis, together with external
information where applicable. The fit minimizes a binned χ²,
which is a function of the correction factors s:



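\[
\chi^2(s) = \sum_{k \in \mathrm{bins}} \frac{(\mathrm{Data}[k] - \mathrm{SM}[k])^2}{\delta\mathrm{SM}[k]^2 + \mathrm{SM}[k]} + \chi^2_{\mathrm{constraints}}(s) \qquad (1)
\]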



Bins k loosely correspond to exclusive final states, with
additional division in object transverse momentum (pT)
and pseudorapidity (η). In each bin k, the Standard
Model background (SM[k]) is a function of the values
of the correction factors s, depending on the overall
integrated luminosity, k-factors of contributing Standard
Model processes, object identification efficiencies and
misidentification rates, and trigger efficiencies. The term
χ²_constraints increases if a correction factor assumes a
value different from the value preferred by external
sources of information, such as an NLO calculation.
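
A minimal sketch of such a constrained binned χ² is given below. Several ingredients are assumptions made for illustration rather than details from the analysis: SM[k] is modeled simply as luminosity times a k-factor-weighted sum of per-process yields, the per-bin variance is taken as the Monte Carlo statistical uncertainty squared plus the Poisson variance of the prediction, the constraint term is Gaussian, and all names are hypothetical.

```python
def predict(factors, yields_per_pb):
    """SM[k] = luminosity * sum over processes of (k-factor * yield per pb^-1).

    `yields_per_pb` is a list with one dict per bin: process name -> yield.
    """
    return [
        factors["luminosity"] * sum(factors[proc] * y for proc, y in bin_yields.items())
        for bin_yields in yields_per_pb
    ]


def chi2(factors, data, yields_per_pb, mc_errors, constraints):
    """Binned chi-square plus penalties for externally constrained factors.

    `constraints` maps a factor name to (preferred value, uncertainty).
    """
    total = 0.0
    sm = predict(factors, yields_per_pb)
    for observed, predicted, mc_err in zip(data, sm, mc_errors):
        variance = mc_err ** 2 + predicted  # assumed form of the bin variance
        if variance > 0.0:
            total += (observed - predicted) ** 2 / variance
    for name, (preferred, uncertainty) in constraints.items():
        # Penalty grows as the factor moves away from its external value.
        total += ((factors[name] - preferred) / uncertainty) ** 2
    return total
```

A fit would hand `chi2` to a general-purpose minimizer and vary `factors`; the constraint term keeps, for example, the luminosity from drifting far from its measured value merely to absorb a discrepancy in one bin.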

With correction factor values globally determined, 
a comparison is performed between data and Stan- 
dard Model prediction to highlight any remaining sig- 
nificant discrepancies. Discrepancies may prompt ad- 
ditional refinement of the Standard Model prediction 
or detector response if such adjustment is not inconsis- 
tent with existing experimental knowledge. The global 
fit and search for remaining discrepancies are repeated
after each adjustment, testing the consistency of the
adjustment with all available high-pT data. Iteration
occurs until either a clear case for new physics can be 
made, or there remain no discrepancies that may mo- 
tivate such a case. Judgement is used to implement 
only physically motivated improvements in this pro- 
cedure, rather than ad hoc modifications that remove 
discrepancies without physical reasoning. In this spirit, 
emphasis on physical understanding within Vista has
resulted in a quantitative and unified understanding of 
the underlying physics responsible for the misidentifi- 
cation of jets as electrons, muons, taus, and photons 
at CDF. 



^1 Here a k-factor is defined as the ratio of the actual
(unknown) Standard Model cross section to the (known)
leading order cross section.



3 Results of the Vista comparison 

The first 927 pb⁻¹ of CDF II data populate 344 exclusive
final states. The first Vista statistic quantifies the
difference between the observed and predicted
populations of these final states (Fig. 1a). For each final
state, the Poisson probability that the expected
population would fluctuate up to or above (or down to or
below) the observed number of events is calculated. A
trials factor associated with examining 344 final states
reduces the significance of the observed discrepancies.
The largest population discrepancy, corresponding to a
2.3σ deficit of data after this trials factor is accounted
for, is not statistically significant.
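
The population statistic can be sketched as follows. This is an illustration only: it treats the prediction as exactly known (ignoring its uncertainty), assumes the final states are statistically independent when applying the trials factor, and uses a one-sided Gaussian convention for the conversion to σ. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def poisson_pmf(n, expected):
    """P(N = n) for a Poisson mean `expected` > 0, computed in log space."""
    return math.exp(n * math.log(expected) - expected - math.lgamma(n + 1))


def poisson_pvalue(observed, expected):
    """Probability of fluctuating up to or above (excess) or down to or
    below (deficit) the observed count."""
    if observed < expected:
        # Deficit: P(N <= observed)
        return sum(poisson_pmf(n, expected) for n in range(observed + 1))
    # Excess: P(N >= observed). Terms decrease monotonically for n >= expected,
    # so the sum is stopped once a term no longer changes the total.
    total, n = 0.0, observed
    while True:
        term = poisson_pmf(n, expected)
        total += term
        if term <= 1e-17 * total:
            return min(total, 1.0)
        n += 1


def apply_trials_factor(p, n_trials):
    """Probability that at least one of n_trials independent tests yields
    a p-value this small: 1 - (1 - p)^n_trials."""
    if p >= 1.0:
        return 1.0
    return -math.expm1(n_trials * math.log1p(-p))


def to_sigma(p):
    """One-sided Gaussian significance for a probability 0 < p < 1."""
    return -NormalDist().inv_cdf(p)
```

For a single final state one would evaluate `to_sigma(apply_trials_factor(poisson_pvalue(n_obs, n_exp), 344))`, where 344 is the number of populated final states quoted in the text.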

In addition to total final state populations, Vista
examines shapes of kinematic distributions. In each
final state, Vista algorithmically produces the
distributions of a large number of potentially informative
kinematic variables. The total number of distributions
considered in all final states is 16,486. The
Kolmogorov-Smirnov (KS) test is used to evaluate the agreement
of data with the Standard Model prediction in each
kinematic variable. The distribution of these KS
probabilities, converted into units of standard deviations
(σ), is shown in Fig. 1b. A trials factor equal to the
number of distributions considered reduces the
significance of any individual observed shape discrepancy.
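
As an illustration of the shape test, the sketch below computes a two-sample KS distance and its asymptotic Kolmogorov probability. It is not the Vista implementation: it assumes unweighted, unbinned samples, whereas a comparison against a Monte Carlo prediction would in practice involve weighted events, and the function names are hypothetical.

```python
import math
from bisect import bisect_right


def ks_distance(sample_a, sample_b):
    """Maximum distance between the empirical CDFs of two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    # The supremum of |F_a - F_b| is attained at one of the sample points.
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def ks_probability(distance, n_a, n_b, terms=100):
    """Asymptotic probability of a KS distance at least this large when
    both samples are drawn from the same distribution."""
    n_eff = n_a * n_b / (n_a + n_b)
    lam = math.sqrt(n_eff) * distance
    if lam < 0.2:
        return 1.0  # the alternating series converges slowly here; Q is ~1
    series = sum(
        (-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
        for j in range(1, terms + 1)
    )
    return min(1.0, max(0.0, 2.0 * series))
```

The resulting probability can be converted to units of σ and corrected for the number of distributions examined, in the same spirit as for the population statistic.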

While the number of events observed in each final
state does not result in a statistically significant
discrepancy that might motivate a new physics claim,
consideration of the shapes of kinematic distributions
results in a few hundred shape discrepancies that
remain statistically significant even after accounting for
the associated trials factor. These shape discrepancies
can generally be categorized as manifestations of two
effects: mismodeling of the intrinsic transverse boost of
the event, and mismodeling of the angular separation
between subleading jets.

Accurate modeling of the intrinsic boost (kT kick)
of events produced at a hadron collider is a long-standing
problem. The symptomatic Vista distributions include
the total energy visible in the detector but not
clustered into any specific reconstructed object; missing
transverse energy in events where this missing
transverse energy is not significant, such as in dijet,
diphoton, and Z production; the projection of the vector
summed momenta of all reconstructed objects along
and perpendicular to the thrust axis in the event; and
other related variables. Although a satisfactory
solution to this problem has not yet been obtained, Vista
currently supplies a reasonably comprehensive
catalog of relevant experimental information from
proton-antiproton collisions.

Fig. 1. (a) Vista population discrepancies, quantifying
the difference between the number of events observed and
predicted in each of the 344 Vista final states considered.
Final states containing more data than Standard Model
prediction populate the right side of the distribution, while
final states containing fewer data events populate the left.
(b) Vista shape discrepancies, quantifying the difference
in shape between data and Standard Model prediction in
16,486 kinematic variables. The horizontal axis ranges from
agreement to disagreement in shape from left to right. In
both (a) and (b) the black curve is the expected
distribution, obtained by drawing pseudo data from the
Standard Model background. The horizontal axis in both (a)
and (b) represents statistical significance, in units of
standard deviations (σ), before accounting for the
associated trials factor (see text).



The mismodeling of the angular separation between
subleading jets is shown most clearly in Fig. 2. Many
other discrepant distributions derive from the effect
shown in this low-pT final state, consisting of one
central (|η| < 1) jet having pT > 40 GeV and two
additional reconstructed jets with |η| < 2.5 and pT >
17 GeV. Derived discrepancies include the masses of
individual jets, where jets in data are observed to be
systematically more massive than jets from the PYTHIA
Standard Model prediction. Although no complete,
quantitative understanding has yet been achieved, pursuit
of a showering-based explanation is ongoing.
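
The angular separation between two jets is conventionally ΔR = √(Δη² + Δφ²). A minimal sketch of the quantity for the second and third jets, ordered by decreasing transverse momentum, is shown below. The jet representation as `(pt, eta, phi)` tuples and the function names are assumptions made for illustration.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into the interval [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def delta_r(eta1, phi1, eta2, phi2):
    """Angular separation sqrt(d_eta^2 + d_phi^2) between two objects."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def subleading_jet_separation(jets):
    """Delta R between the second and third jets, ordered by decreasing pT.

    `jets` is a list of at least three (pt, eta, phi) tuples.
    """
    ordered = sorted(jets, key=lambda jet: jet[0], reverse=True)
    (_, eta2, phi2), (_, eta3, phi3) = ordered[1], ordered[2]
    return delta_r(eta2, phi2, eta3, phi3)
```

Wrapping the azimuthal difference matters: two jets at φ = 3.0 and φ = −3.0 are close together in the detector, not nearly six radians apart.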



Fig. 2. A shape discrepancy highlighted by Vista in the
final state consisting of exactly three reconstructed jets with
|η| < 2.5 and pT > 17 GeV, and with one of the jets
satisfying |η| < 1 and pT > 40 GeV. The discrepancy is clearly
statistically significant, with statistical error bars smaller
than the size of the data points. The vertical axis shows the
number of events per bin, with the horizontal axis
showing the angular separation (ΔR = √(Δη² + Δφ²)) between
the second and third jets, where the jets are ordered
according to decreasing transverse momentum. The region
ΔR(j2, j3) ≳ 2 is populated primarily by initial state
radiation, and here the Standard Model prediction can to
some extent be adjusted; the region ΔR(j2, j3) ≲ 2 is
dominated by final state radiation, the description of which is
constrained by data from LEP 1.



4 Conclusion

These proceedings have motivated and briefly outlined
the Vista global analysis, together with the result
obtained on 927 pb⁻¹ of CDF II data. This analysis
represents the first model-independent search in
hadron-hadron collider data of this scope, including
16,486 kinematic variables in 344 populated exclusive
final states defined by seven reconstructed objects (e,
μ, τ, γ, j, b, p̸T). This result should not be construed
as having proven that there is no new physics hiding
in the Tevatron data; it shows merely that this analysis
has not revealed a discrepancy that appears to motivate
a new physics claim. New physics above
the electroweak scale appearing with low cross section
represents a more specific target of Sleuth,
discussed in a companion proceedings [8].